AITopics | labeled and unlabeled data

Collaborating Authors

labeled and unlabeled data

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

InterLUDE: Interactions between Labeled and Unlabeled Data to Enhance Semi-Supervised Learning

Huang, Zhe, Yu, Xiaowei, Zhu, Dajiang, Hughes, Michael C.

arXiv.org Artificial IntelligenceMar-15-2024

Semi-supervised learning (SSL) seeks to enhance task performance by training on both labeled and unlabeled data. Mainstream SSL image classification methods mostly optimize a loss that additively combines a supervised classification objective with a regularization term derived solely from unlabeled data. This formulation neglects the potential for interaction between labeled and unlabeled images. In this paper, we introduce InterLUDE, a new approach to enhance SSL made of two parts that each benefit from labeled-unlabeled interaction. The first part, embedding fusion, interpolates between labeled and unlabeled embeddings to improve representation learning. The second part is a new loss, grounded in the principle of consistency regularization, that aims to minimize discrepancies in the model's predictions between labeled versus unlabeled inputs. Experiments on standard closed-set SSL benchmarks and a medical SSL task with an uncurated unlabeled set show clear benefits to our approach. On the STL-10 dataset with only 40 labels, InterLUDE achieves 3.2% error rate, while the best previous method reports 14.9%.

interlude, labeled and unlabeled data, learning, (11 more...)

arXiv.org Artificial Intelligence

2403.10658

Country:

Europe > United Kingdom (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > France (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Probabilistic Modeling for Face Orientation Discrimination: Learning from Labeled and Unlabeled Data

Neural Information Processing SystemsApr-6-2023, 17:26:20 GMT

This paper presents probabilistic modeling methods to solve the problem of dis(cid:173) criminating between five facial orientations with very little labeled data. The first model maintains no inter-pixel dependencies, the second model is capable of modeling a set of arbitrary pair-wise dependencies, and the last model allows dependencies only between neighboring pixels. We show that for all three of these models, the accuracy of the learned models can be greatly improved by augmenting a small number of labeled training images with a large set of unlabeled images using Expectation-Maximization. This is important because it is often difficult to obtain image labels, while many unla(cid:173) beled images are readily available. Through a large set of empirical tests, we examine the benefits of unlabeled data for each of the models.

face orientation discrimination, labeled and unlabeled data, probabilistic modeling, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.65)

Add feedback

Comparing the Value of Labeled and Unlabeled Data in Method-of-Moments Latent Variable Estimation

Chen, Mayee F., Cohen-Wang, Benjamin, Mussmann, Stephen, Sala, Frederic, Ré, Christopher

arXiv.org Machine LearningMar-3-2021

Labeling data for modern machine learning is expensive and time-consuming. Latent variable models can be used to infer labels from weaker, easier-to-acquire sources operating on unlabeled data. Such models can also be trained using labeled data, presenting a key question: should a user invest in few labeled or many unlabeled points? We answer this via a framework centered on model misspecification in method-of-moments latent variable estimation. Our core result is a bias-variance decomposition of the generalization error, which shows that the unlabeled-only approach incurs additional bias under misspecification. We then introduce a correction that provably removes this bias in certain cases. We apply our decomposition framework to three scenarios -- well-specified, misspecified, and corrected models -- to 1) choose between labeled and unlabeled data and 2) learn from their combination. We observe theoretically and with synthetic experiments that for well-specified models, labeled points are worth a constant factor more than unlabeled points. With misspecification, however, their relative value is higher due to the additional bias but can be reduced with correction. We also apply our approach to study real-world weak supervision techniques for dataset construction.

estimator, generalization error, misspecification, (15 more...)

arXiv.org Machine Learning

2103.02761

Country:

Asia > Japan > Shikoku > Kagawa Prefecture > Takamatsu (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Semiconductors & Electronics (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Learning From Labeled And Unlabeled Data: An Empirical Study Across Techniques And Domains

Chawla, N. V., Karakoulas, G.

Journal of Artificial Intelligence ResearchMar-1-2005

There has been increased interest in devising learning techniques that combine unlabeled data with labeled data - i.e. semi-supervised learning. However, to the best of our knowledge, no study has been performed across various techniques and different types and amounts of labeled and unlabeled data. Moreover, most of the published work on semi-supervised learning techniques assumes that the labeled and unlabeled data come from the same distribution. It is possible for the labeling process to be associated with a selection bias such that the distributions of data points in the labeled and unlabeled sets are different. Not correcting for such bias can result in biased function approximation with potentially poor performance. In this paper, we present an empirical study of various semi-supervised learning techniques on a variety of datasets. We attempt to answer various questions such as the effect of independence or relevance amongst features, the effect of the size of the labeled and unlabeled sets and the effect of noise. We also investigate the impact of sample-selection bias on the semi -supervised learning techniques under study and implement a bivariate probit technique particularly designed to correct for such bias.

dataset, learning, unlabeled data, (13 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1509

AI Access Foundation

10404

Journal of Artificial Intelligence Research

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > District of Columbia > Washington (0.14)
(12 more...)

Genre: Research Report > Experimental Study (0.92)

Industry:

Banking & Finance (0.46)
Education (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)

Add feedback

Probabilistic Modeling for Face Orientation Discrimination: Learning from Labeled and Unlabeled Data

Baluja, Shumeet

Neural Information Processing SystemsDec-31-1999

This paper presents probabilistic modeling methods to solve the problem of discriminating between five facial orientations with very little labeled data.

dependency, pixel, unlabeled data, (12 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Probabilistic Modeling for Face Orientation Discrimination: Learning from Labeled and Unlabeled Data

Baluja, Shumeet

Neural Information Processing SystemsDec-31-1999

This paper presents probabilistic modeling methods to solve the problem of discriminating between five facial orientations with very little labeled data.

dependency, pixel, unlabeled data, (12 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Probabilistic Modeling for Face Orientation Discrimination: Learning from Labeled and Unlabeled Data

Baluja, Shumeet

Neural Information Processing SystemsDec-31-1999

This paper presents probabilistic modeling methods to solve the problem of discriminating betweenfive facial orientations with very little labeled data.

artificial intelligence, dependency, machine learning, (13 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback